Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation

Authors

  • Ivan Vulic
  • Nikola Mrksic
  • Anna Korhonen
Abstract

Existing approaches to automatic VerbNet-style verb classification are heavily dependent on feature engineering and therefore limited to languages with mature NLP pipelines. In this work, we propose a novel cross-lingual transfer method for inducing VerbNets for multiple languages. To the best of our knowledge, this is the first study that demonstrates how the architectures for learning word embeddings can be applied to this challenging syntactic-semantic task. Our method uses cross-lingual translation pairs to tie each of the six target languages into a bilingual vector space with English, jointly specialising the representations to encode the relational information from English VerbNet. A standard clustering algorithm is then run on top of the VerbNet-specialised representations, using vector dimensions as features for learning verb classes. Our results show that the proposed cross-lingual transfer approach sets new state-of-the-art verb classification performance across all six target languages explored in this work.
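The final step described in the abstract, clustering verbs with vector dimensions as features, can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, so plain k-means is used here purely as an assumed stand-in; the function name `kmeans`, the toy verbs, and the three-dimensional vector values are all hypothetical and do not come from the paper.

```python
import math
import random


def kmeans(vectors, k, iterations=20, seed=0):
    """Cluster vectors with plain k-means; each vector dimension is one feature."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Hypothetical toy vectors standing in for specialised verb representations;
# the values are invented for illustration only.
verb_vectors = {
    "run": [0.9, 0.1, 0.0],
    "walk": [0.8, 0.2, 0.1],
    "jog": [0.85, 0.15, 0.05],
    "say": [0.1, 0.9, 0.2],
    "tell": [0.0, 0.8, 0.3],
    "speak": [0.05, 0.85, 0.25],
}

verbs = list(verb_vectors)
labels = kmeans([verb_vectors[v] for v in verbs], k=2)

# Group verbs by cluster label to read off the induced classes.
clusters = {}
for verb, label in zip(verbs, labels):
    clusters.setdefault(label, []).append(verb)
print(clusters)
```

In this toy setting the motion verbs and the communication verbs should end up in separate clusters. Real specialised vectors would have hundreds of dimensions, and the number of clusters would be set to the number of target verb classes.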


Similar resources

Cross-Lingual Syntactically Informed Distributed Word Representations

We develop a novel cross-lingual word representation model which injects syntactic information through dependency-based contexts into a shared cross-lingual word vector space. The model, termed CLDEPEMB, is based on the following assumptions: (1) dependency relations are largely language-independent, at least for related languages and prominent dependency links such as direct objects, as evidenc...

Semantic Specialisation of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints

We present ATTRACT-REPEL, an algorithm for improving the semantic quality of word vectors by injecting constraints extracted from lexical resources. ATTRACT-REPEL facilitates the use of constraints from mono- and cross-lingual resources, yielding semantically specialised cross-lingual vector spaces. Our evaluation shows that the method can make use of existing cross-lingual lexicons to construct h...

Two modes of assessment: the case of academicians' writing

This study attempted to investigate writing problems and the relationship between expert-assessment and self-assessment of writing problems. Participants were thirty-four non-English faculty members of Tehran and Guilan universities. The instruments were writing an essay on the topic "What teaching strategies do you use in your classes?" in twenty-five lines and filling in the questionnaire of wri...

Document Representation with Statistical Word Senses in Cross-Lingual Document Clustering

Cross-lingual document clustering is the task of automatically organizing a large collection of multi-lingual documents into a few clusters, depending on their content or topic. It is well known that the language barrier and translation ambiguity are two challenging issues for cross-lingual document representation. To this end, we propose to represent cross-lingual documents through statistical wor...

Classification of transformer faults using frequency response analysis based on cross-correlation technique and support vector machine

One of the most important methods for transformer fault diagnosis (especially mechanical defects) is the frequency response analysis (FRA) method. The most important step in the FRA diagnostic process is to differentiate the faults and classify them into different classes. This paper uses the intelligent support vector machine (SVM) method to classify transformer faults. For this purpose, two gr...

Publication date: 2017